Anomaly Detection in Video Using Predictive Convolutional Long Short-Term Memory Networks
نویسندگان
چکیده
Automating the detection of anomalous events within long video sequences is challenging due to the ambiguity of how such events are defined. We approach the problem by learning generative models that can identify anomalies in videos using limited supervision. We propose end-to-end trainable composite Convolutional Long Short-Term Memory (Conv-LSTM) networks that are able to predict the evolution of a video sequence from a small number of input frames. Regularity scores are derived from the reconstruction errors of a set of predictions with abnormal video sequences yielding lower regularity scores as they diverge further from the actual sequence over time. The models utilize a composite structure and examine the effects of 'conditioning' in learning more meaningful representations. The best model is chosen based on the reconstruction and prediction accuracies. The Conv-LSTM models are evaluated both qualitatively and quantitatively, demonstrating competitive results on anomaly detection datasets. Conv-LSTM units are shown to be an effective tool for modeling and predicting video sequences.
منابع مشابه
Speech Emotion Recognition Using Scalogram Based Deep Structure
Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...
متن کاملConvolutional Residual Memory Networks
Very deep convolutional neural networks (CNNs) yield state of the art results on a wide variety of visual recognition problems. A number of state of the the art methods for image recognition are based on networks with well over 100 layers and the performance vs. depth trend is moving towards networks in excess of 1000 layers. In such extremely deep architectures the vanishing or exploding gradi...
متن کاملTracking of Humans in Video Stream Using LSTM Recurrent Neural Network
In this master thesis, the problem of tracking humans in video streams by using Deep Learning is examined. We use spatially supervised recurrent convolutional neural networks for visual human tracking. In this method, the recurrent convolutional network uses both the history of locations and the visual features from the deep neural networks. This method is used for tracking, based on the detect...
متن کاملAnomaly Detection for Temporal Data using Long Short-Term Memory (LSTM)
We explore the use of Long short-term memory (LSTM) for anomaly detection in temporal data. Due to the challenges in obtaining labeled anomaly datasets, an unsupervised approach is employed. We train recurrent neural networks (RNNs) with LSTM units to learn the normal time series patterns and predict future values. The resulting prediction errors are modeled to give anomaly scores. We investiga...
متن کاملThe University of Passau Open Emotion Recognition System for the Multimodal Emotion Challenge
This paper presents the University of Passau’s approaches for the Multimodal Emotion Recognition Challenge 2016. For audio signals, we exploit Bag-of-Audio-Words techniques combining Extreme Learning Machines and Hierarchical Extreme Learning Machines. For video signals, we use not only the information from the cropped face of a video frame, but also the broader contextual information from the ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1612.00390 شماره
صفحات -
تاریخ انتشار 2016